GENE SEQUENCES OF RUBELLA VIRUS ASSOCIATED WITH ATTENUATION



Background of the Invention 



Rubella virus is the causative agent of German
measles, a viral infection associated with a mild fever and
rash. The most serious complications of rubella occur
during pregnancy due to transplacental passage of the virus
to the fetus resulting in the widespread manifestations of
congenital rubella. These include fetal loss, or
multisystem defects in the newborn such as cataracts,
deafness, cardiac abnormalities and microcephaly.

To prevent congenital infection, a universal
vaccination scheme for all children around 15 months of age
was implemented in North America in 1969, using attenuated
vaccines which had recently been developed. While reducing
the level of rubella circulating in the community,
vaccination of young children did not significantly alter
the proportion of women entering their childbearing years
without protective levels of circulating antibody,
reported to be around 10-15%. This population was
therefore also targeted for vaccination.

Vaccination reduced the incidence of congenital
rubella but was found to be associated with a number of
sequelae, particularly in women over 25 years of age.
Symptoms included arthritis, neurological manifestations
and chronic fatigue. The most notable complication of
rubella immunisation was arthritis, which has also
frequently been documented as a consequence of natural
rubella. The joint symptoms induced can be severe in the
acute stage but usually resolve without causing permanent
joint damage. Occasionally, however, chronic or recurrent
arthritis develops which can persist for many months or
years in certain individuals (Ford et al., 1988).

Several vaccines have been used in North America since
1969. These include two variants of the HPV77 strain
originally produced by Dr. Meyer from the wild strain
M33 by multiple passages in monkey kidney cells (Meyer
et al., 1965). The HPV77 strain was further attenuated by
a further 5 passages in duck embryo cells (to give the
HPV77/DE5 strain) or by 12 passages in dog kidney cells (to
give the HPV77/DK12 strain). The HPV77/DK12 vaccine proved
to be too reactogenic even in children and was soon removed
from distribution. The HPV77/DE5 vaccine was used as part
of the M-M-R vaccine (measles/mumps/rubella combined
vaccine; Merck Sharp & Dohme, West Point, Pa., U.S.A.) until
1979 when it was replaced in the M-M-R II vaccine by the
RA27/3 strain (Plotkin and Buser, 1985), which is the
current vaccine strain used in North America.

The Cendehill strain (Peetermans & Huygelen, 1967) was
developed in Belgium and was the predominant strain used in
vaccine production in Europe until 1989. The Cendehill
strain is reported to be associated with a decreased
incidence of complications in the adult female population
in a comparative study of five vaccines. Best et al.
(1974) reported that acute arthritis occurred in only 3% of
individuals immunised with Cendehill vaccine but in 17% of
those receiving RA27/3. Moreover, the symptoms with RA27/3
were also more prolonged. The disadvantage of the
Cendehill vaccine was that the mean titre of HAI antibody
induced in vaccine recipients was lower than that obtained
with the RA27/3 strain, indicating that Cendehill is less
immunogenic.

A close correlation has been found between the ability
of a given strain of rubella virus to infect and persist in
human joint tissue in culture and its association with the
induction of arthropathy in vivo, suggesting that tropism
for joint tissue is an important determinant of the ability
to induce joint symptoms (arthritogenicity). As reported
in Miki and Chantler (1992), wild-type strains (Therien and
M33) were found to grow to high titres of lO^-lo'^ pfu/ml in
the medium of either cell cultures or organ cultures
derived from human joint tissue. In contrast, the RA27/3
strain was considerably restricted for growth, giving yields
of lO-^-lO*^ pfu/ml, and the Cendehill strain showed no growth
at all. These results correlate with the known
associations of rubella strains and joint symptoms in vivo.



Rubella virus is a small (60-70 nm) enveloped
togavirus, the sole member of the genus Rubivirus. It has
a single-stranded RNA genome approximately 10 kb in size.
The genomic RNA is positive-stranded, which means that it
can act as mRNA within the infected cell. The sequence of
the entire genome has been determined for two wild-type
strains, Therien and M33 (Dominguez et al., 1990; Gillam
et al., 1993, Genbank No. X72393), and the RA27/3 vaccine
strain (Pugachev et al., 1997). The genome contains two
large open-reading frames (ORFs) which code for the
structural proteins (3' proximal 3189 nucleotides) and
non-structural proteins (5' proximal 6345 nucleotides).
The current understanding is that the open-reading frames
for the structural and the non-structural proteins are
separated by a region of about 123 nucleotides.
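The ORF lengths quoted above can be sanity-checked with a few lines of arithmetic. The following is a minimal sketch only; it assumes that each quoted length covers a whole number of codons and includes the stop codon, which the text does not state. The constant and function names are illustrative.

```python
# Lengths quoted in the text, in nucleotides.
NS_ORF_NT = 6345     # 5'-proximal ORF for the non-structural proteins
S_ORF_NT = 3189      # 3'-proximal ORF for the structural proteins
INTER_ORF_NT = 123   # approximate region separating the two ORFs


def codons(orf_nt: int) -> int:
    """Return the number of codons in an ORF of the given length."""
    if orf_nt % 3 != 0:
        raise ValueError("ORF length is not a whole number of codons")
    return orf_nt // 3


ns_codons = codons(NS_ORF_NT)
s_codons = codons(S_ORF_NT)

# Nucleotides accounted for by the two ORFs and the spacer; the remainder
# of the roughly 10 kb genome would be terminal non-translated sequence.
accounted_nt = NS_ORF_NT + INTER_ORF_NT + S_ORF_NT

print(ns_codons, s_codons, accounted_nt)
```

Under the stop-codon assumption, the structural ORF would encode one residue fewer than its codon count.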



The infected cell contains two virus-induced
positive-strand RNA species, the genomic RNA (40S; 10 kb)
and a sub-genomic mRNA (26S; 3 kb) which encodes the major
ORF for the structural proteins. The ORF for structural
proteins is translated into a 110 kD polyprotein and is
subsequently cleaved by cellular signal peptidase into the
three structural viral proteins, E1, E2, and C. The order
of structural genes was originally determined by
synchronised translation as being NH2-C-E2-E1-COOH, which
was confirmed by sequence analysis of cDNA clones of the
subgenomic mRNA (Clarke et al., 1987; Frey & Marr, 1988;
and Zheng et al., 1989).

The non-structural (NS) genes are translated from the
full-length genomic RNA as a >200 kD polyprotein which is
subsequently cleaved into two non-structural proteins, p150
and p90. These comprise the enzymes required for viral
replication in the cell. Protein p150, nearest the
5' terminus, is 11- CC amino-acids in length and contains the
putative methyltransferase function and the viral protease.
Protein p90 is 905 amino-acids long and has regions of
homology with global helicase and replicase domains.

Summary of the Invention

This invention provides nucleic acids (DNA or RNA)
comprising one or more sequences of nucleotides
corresponding to all or part of the genome of the Cendehill
strain of rubella virus. Nucleic acids of this invention
may encode an infectious virus of the Cendehill strain or
one having an attenuated phenotype equivalent to Cendehill
strain. DNA of this invention may be in a plasmid or viral
vector which enables replication and/or transcription of
the Cendehill cDNA and is referred to herein as a Cendehill
infectious clone. The infectious clone may be used to
produce a DNA vaccine for rubella virus.

This invention also provides a nucleic acid (DNA or
RNA) comprising a sequence of nucleotides that includes a
first portion corresponding to one or more of the
non-translated regions, p150, p90, C, E2 and E1 gene
regions of Cendehill strain and a second portion that is
derived from another rubella virus strain such that the
product encodes a novel infectious chimeric rubella virus
strain. DNA of this invention may be in a plasmid or viral
vector forming an infectious clone.

This invention also provides a chimeric
Cendehill/RA27/3 clone whose genome includes a first
portion corresponding to the Cendehill 5' non-translated
RNA, Cendehill p150 and p90 and wherein a second portion
corresponds to the structural gene region and the 3'
non-translated region of RA27/3 strain. This clone can be
used to produce a chimeric virus that expresses the
structural proteins of RA27/3 but has the genetic structure
at the 5' end and in the non-structural genes of Cendehill
strain that determine the non-arthrotropic nature of this
strain.



This invention also provides RNA encoding the entire
genome of Cendehill or the Cendehill/RA27/3 chimera or a
fragment thereof, produced by transcribing the
aforementioned DNA. This invention also provides rubella
virus produced by transcribing the DNA, transfecting cells
with the RNA so derived, and recovering virus from cells so
transfected.

This invention also provides a nucleic acid encoding
one or more Cendehill strain rubella virus proteins
selected from the group consisting of: p150, p90, C, E1 and
E2, or wherein the nucleic acid corresponds to a
non-translated region of the Cendehill genome. The nucleic
acid may be DNA or RNA and may be incorporated into a
plasmid or viral vector for expression of protein.

This invention also provides a method of producing
Cendehill viral protein comprising the steps of expressing
a DNA sequence encoding a protein corresponding to
Cendehill protein p150, p90, C, E2 or E1 in a cell by means
of a suitable expression vector and recovering the protein
so expressed. The protein may be a Cendehill protein
having a sequence corresponding to a portion of the cDNA
sequence in Appendix 1 or the protein may be altered by
modification of the Cendehill cDNA, as described herein.

This invention also provides a method of producing a
recombinant DNA encoding a mutated or chimeric rubella
virus exhibiting the lack of arthrotropicity of the
Cendehill strain but with additional advantageous
properties that include, but are not restricted to,
increased immunogenicity or stability of another rubella
strain.



This method comprises steps whereby:

(a) nucleotides in Cendehill cDNA encoding viral
structural proteins are altered such that the protein so
encoded increases the immunogenicity or stability of a
recombinant rubella virus comprising said protein; or

(b) nucleotides in the non-translated regions or
non-structural gene region of cDNA for rubella virus other
than Cendehill are altered to decrease arthritogenicity of
a recombinant rubella virus coded for by the altered cDNA.

cDNA from steps (a) or (b) may be incorporated into a
plasmid or viral vector to produce an infectious clone,
from which RNA may be transcribed and transfected into
cells to provide virus that may be used as a recombinant
rubella vaccine. Alternatively, cDNA from (a) or (b) in a
suitable vector may be used as a DNA vaccine.



This invention also provides a rubella virus whose
genetic material comprises a first portion corresponding to
one or more RNA sequences selected from the group
consisting of: Cendehill non-translated RNA, Cendehill
p150, p90, C, E1 and E2 RNA; and wherein a second portion
of the genome corresponds to RNA of a rubella virus other
than Cendehill.



This invention also provides a Cendehill viral protein
free of virus, selected from the group consisting of: p150,
p90, C, E1 and E2, produced by expressing Cendehill cDNA
encoding said protein from an expression vector.

This invention also provides rubella cDNA, RNA or a
rubella virus having one or more of the Cendehill
strain-specific nucleotides selected from a group
consisting of: 37-C, 55-G, 118-T (or U), 358-C, 2829-A,
3060-G, 3164-C, 3528-T (or U), 4530-T (or U), 6611-C,
6770-G, 6771-G, 7428-T (or U), 8786-G, 8788-T (or U), 8864-A,
9180-T (or U), 9254-A; and 9741-T (or U). The aforesaid
nucleotide numbers are in reference to nucleotides bearing
the same numbers as shown in Appendix 1 for Cendehill.

cDNA, RNA or virus of this invention may have the
strain-specific nucleotide at a different nucleotide
position number as compared to Cendehill, providing the
context of the strain-specific nucleotide is the same as
for Cendehill. In this instance, context defines the five
nucleotides on either side of the strain-specific
nucleotide in Cendehill.
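The "context" test described above can be sketched as a string search: take the strain-specific base together with the five bases on either side in the reference, then look for that 11-base window in another sequence. This is an illustrative sketch only. The function names and the toy sequences are hypothetical and are not taken from Appendix 1, and U is treated as T so that RNA and cDNA can be compared.

```python
def context_window(seq: str, pos: int, flank: int = 5) -> str:
    """Return the base at 1-based `pos` plus `flank` bases on either side."""
    i = pos - 1
    if i - flank < 0 or i + flank >= len(seq):
        raise ValueError("position too close to the end of the sequence")
    return seq[i - flank : i + flank + 1].upper().replace("U", "T")


def find_in_context(reference: str, pos: int, query: str, flank: int = 5) -> list[int]:
    """1-based positions in `query` where the reference base sits in identical context."""
    window = context_window(reference, pos, flank)
    q = query.upper().replace("U", "T")
    hits = []
    start = q.find(window)
    while start != -1:
        hits.append(start + flank + 1)  # position of the central base
        start = q.find(window, start + 1)
    return hits


# Toy sequences (hypothetical): the reference base of interest is at position 9.
reference = "GGACGTTCAGCATTGCA"
query = "AACGUUCAGCAUUGG"  # RNA; same 11-base window, shifted by one position
hits = find_in_context(reference, 9, query)
print(hits)
```

An exact-match search is the strictest reading of "the same context"; a looser reading would need an alignment step, which is not shown here.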



This invention also provides a Cendehill cDNA, and
genomic RNA that encodes a rubella virus protein selected
from the group of proteins p150, p90, C, E1 and E2 and with
one or more Cendehill strain-specific amino-acids defined
as p150/929/tyr, p150/1006/gly, p150/1041/his,
p150/1162/val, p90/1496/ile, C/34/pro, C/87/gly,
E2/306/val, E2/413/ile, E1/759/asp, E1/785/met,
E1/890/leu, and E1/915/thr. The aforesaid strain-specific
amino acids are identified by protein name, amino-acid
position within the Cendehill rubella polyprotein, and the
identity of an amino-acid at such a position. Such
proteins of this invention include proteins having the
strain-specific amino acid at a different amino acid
position number in the protein as compared to Cendehill
providing the context of the strain-specific amino acid is
the same as for Cendehill. In this instance, context is
defined as including the three amino acids to either side
of the strain-specific amino acid in Cendehill. In this
specification, reference to a strain-specific amino acid
such as p150/929/tyr will be used to identify the amino
acid as well as a protein (eg. p150) containing the
strain-specific amino acid, in context as described herein.
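The protein/position/residue notation used above lends itself to a small parser. The sketch below is illustrative; the class and function names are hypothetical and are not part of the specification.

```python
from typing import NamedTuple


class StrainMarker(NamedTuple):
    protein: str      # e.g. "p150"
    residue: int      # residue number within the polyprotein
    amino_acid: str   # three-letter code, lower case


def parse_marker(label: str) -> StrainMarker:
    """Parse a label such as 'p150/929/tyr' into its three parts."""
    protein, residue, amino_acid = (part.strip() for part in label.split("/"))
    return StrainMarker(protein, int(residue), amino_acid.lower())


markers = [parse_marker(s) for s in ("p150/929/tyr", "p90/1496/ile", "E1/915/thr")]
for marker in markers:
    print(marker.protein, marker.residue, marker.amino_acid)
```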

This invention also provides a nucleic acid (eg. DNA)
for the first non-translated region (NTR) and first stem
loop (nucleotides 1 to 65) equivalent to that found in the
Cendehill strain and characterised as being a major
determinant of growth restriction in joint tissue.
Specific characteristics of this stem loop in Cendehill
include two nucleotide changes from the wild-type Therien
strain: a U to C at nucleotide 37 in the predicted terminal
loop that alters the size of the loop from 6 to
11 nucleotides, and an A to G at nucleotide 55 that
increases the size of the predicted medial loop from
6 to 10 nucleotides. These two nucleotide changes at these
positions and in the context found in Cendehill strain
(defining the five nucleotides on either side of each
nucleotide) are determinants of arthrotropism. Other
mutations between nucleotides 20-28 and 52-60 that either
increase or decrease the predicted size of the medial loop
are included within the scope of this invention. Similarly,
any mutation that alters the predicted size of the terminal
loop and alters the phenotypic characteristics of the virus
is within the scope of this invention. Factors that
define the determinants of joint cell restriction include
sequence-specific changes in the medial or terminal loop or
changes that alter the size of either or both of the loops.
These regions include nucleotides 20-28, 33-43 and 52-60.



Appendix 1 sets out the sequence of cDNA representing
the Cendehill genome. The locations of the various
non-translated regions and coding regions are shown. Two
polyproteins are encoded, beginning at the start codons
indicated for p150 and the C protein, respectively. The






wo 99/61637 PCT/CA99/00479 

amino acid sequence of each polyprotein and the respective
structural and non-structural proteins may be determined
from the nucleotide sequence of Appendix 1. In this
specification, the location of an amino-acid will be given
by reference to a residue number of a polyprotein, which
residue number may be determined directly from the series
of codons shown in Appendix 1 commencing at one or the
other of the start codons.

The term "corresponding" as used in this specification
means that when a nucleic acid, peptide or protein is
described by reference to a specified nucleic acid, peptide
or protein, the nucleic acid, peptide or protein so
described may include a nucleotide or amino acid sequence
which differs from the sequence of the specified nucleic
acid, peptide or protein. Corresponding nucleic acids,
polypeptides or proteins will include sequences of
differing length or which differ by one or more
substitutions, additions or deletions. Nucleic acids,
peptides and proteins of this invention include fragments
of specified nucleic acids, peptides or proteins and may
include additional amino acid or nucleotide sequences from
that specified. Furthermore, corresponding nucleic acids
include complementary nucleic acids, meaning those nucleic
acids capable of base pairing with a specified nucleic
acid. Nucleic acids having sequences which differ from the
sequence of a specified nucleic acid due to degeneracy of
the genetic code are also included within the meaning of
the term "corresponding". Further, nucleic acids which
encode peptides or proteins in which there are conservative
substitutions, additions or deletions as compared to a
specified peptide or protein are included. Any and all
such nucleotide variations and resulting amino acid
polymorphisms which provide the advantages of this
invention as described herein are within the scope of this
invention.

Nucleic acids within the scope of this invention may
contain linkers, modified or unmodified restriction
endonuclease sites and other sequences of nucleotides
useful for cloning, expression, or purification. Nucleic
acids within the scope of this invention may be
incorporated in a larger sequence of nucleotides, including
plasmids and vectors useful for manipulation or expression
of nucleic acids.



One measure of "correspondence" of nucleic acids,
peptides or proteins with respect to this invention is
relative "identity" between sequences. In the case of
peptides or proteins, or in the case of nucleic acids
defined according to an encoded peptide or protein,
correspondence includes a peptide having at least about 50%
identity, more preferably at least about 70% identity, even
more preferably at least about 90% identity, even more
preferably at least about 95% and most preferably at least
about 98-99% identity to a specified peptide or protein.
Preferred measures of identity as between nucleic acids are
the same as specified above for peptides, with at least
about 90% or at least about 98-99% identity being more or
most preferable.

The term "identity" as used herein refers to the
measure of identity of sequence between two peptides or
between two nucleic acid molecules. Identity can be
determined by comparing a position in each sequence which
may be aligned for purposes of comparison. Two amino acid
or nucleic acid sequences are considered substantially
identical if they share at least about 75% sequence
identity, preferably at least about 90% sequence identity,
even more preferably at least 95% sequence identity and
most preferably at least about 98-99% identity.



Sequence identity may be determined by the BLAST
algorithm described in Altschul et al. (1990) J. Mol. Biol.
215:403-410, using the published default settings. When a
position in the compared sequence is occupied by the same
base or amino acid, the molecules are considered to have
shared identity at that position. The degree of identity
between sequences is a function of the number of matching
positions shared by the sequences.
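The position-by-position definition of identity can be illustrated with a short calculation over two sequences that have already been aligned. This is a sketch under stated assumptions, not the BLAST algorithm itself: the inputs are assumed to be pre-aligned strings of equal length with "-" marking gaps, a gap opposite a residue counts as a mismatch, and the "about" thresholds from the text are treated as hard cut-offs. The function names and band labels are illustrative.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of aligned columns in which both sequences carry the same residue."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    # Ignore columns that are gaps in both sequences.
    columns = [(x, y) for x, y in zip(a.upper(), b.upper()) if not (x == "-" and y == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


def identity_band(pct: float) -> str:
    """Map a percentage onto the preference bands given in the text."""
    bands = (
        (98.0, "most preferred"),
        (95.0, "even more preferred"),
        (90.0, "preferred"),
        (75.0, "substantially identical"),
    )
    for threshold, label in bands:
        if pct >= threshold:
            return label
    return "below threshold"


pct = percent_identity("ACGTACGTAC", "ACGTACGTTC")  # toy sequences, one mismatch in ten
print(pct, identity_band(pct))
```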

An alternate measure of identity of nucleic acid
sequences is to determine whether two sequences hybridize
to each other under low stringency, and preferably high
stringency conditions. Such sequences are substantially
identical when they will hybridize under high stringency
conditions. Hybridization to filter-bound sequences under
low stringency conditions may, for example, be performed in
0.5 M NaHPO4, 7% sodium dodecyl sulfate (SDS), 1 mM EDTA at
65°C, and washing in 0.2 x SSC/0.1% SDS at 42°C (see Ausubel
et al. (eds.), 1989, Current Protocols in Molecular Biology,
Vol. 1, Greene Publishing Associates, Inc., and John Wiley
& Sons, Inc., New York, at p. 2.10.3). Alternatively,
hybridization to filter-bound sequences under high
stringency conditions may, for example, be performed in
0.5 M NaHPO4, 7% SDS, 1 mM EDTA at 65°C, and washing in
0.1 x SSC/0.1% SDS at 68°C (see Ausubel et al. (eds.), 1989,
supra). Hybridization conditions may be modified in
accordance with known methods depending on the sequence of
interest (see Tijssen, 1993, Laboratory Techniques in
Biochemistry and Molecular Biology -- Hybridization with
Nucleic Acid Probes, Part I, Chapter 2 "Overview of
Principles in Hybridization and the Strategy of Nucleic
Acid Probe Assays", Elsevier, New York). Generally,
stringent conditions are selected to be about 5°C lower
than the thermal melting point for the specific sequence at
a defined ionic strength and pH.



Nucleic acids of this invention will preferably
exhibit substantial identity to Cendehill, with respect to
the regions of the Cendehill genome described herein which
relate to the arthrotropic phenotype of Cendehill. More
preferably, such regions will have at least about 96%
identity. Most preferably, there will be complete identity
in the "context" of Cendehill strain-specific nucleotides
or amino acids, as "context" is described herein.

With reference to nucleic acids corresponding to the
first 5' NTR of Cendehill, such correspondence may be
determined by predicting the folded structure of the region
rather than by measuring sequence identity. Nucleic acids
of this invention include a 5' NTR having a folded structure
in which one or both of the terminal and medial loops is
altered in size as compared to wild-type. The size of the
loop may be quantified according to the number of un-paired
bases in the loop region. Preferably, such alterations
result in an increase in size of the loop as compared to
wild-type. More preferably, such altered loops will be of
at least the size of the terminal and medial loops
described herein for Cendehill. Most preferably, the
sequence of un-paired bases in either loop region will be
substantially the same as described herein for Cendehill
loops. Further, nucleic acids of this invention comprising
a 5' NTR may include a bulge which is increased in size as
compared to the wild-type bulge and preferably will have at
least four un-paired bases in a bulge to one side of the
stem structure. Most preferably, the sequence of un-paired
bases in such a bulge will be substantially as described
herein for the Cendehill bulge. Determination of predicted
folding of a 5' NTR is carried out as described herein using
the Mfold™ 3.0 program.
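Counting un-paired bases in a predicted fold can be sketched as follows. The sketch assumes the predicted structure has been converted to dot-bracket notation ("(" and ")" for paired bases, "." for un-paired bases); this is a common interchange format but not necessarily the output produced by the folding program named above. Here "hairpin" corresponds to a terminal loop, "internal" to a loop with un-paired bases on both strands (as a medial loop might be), and "bulge" to un-paired bases on one side of the stem only. The function names and the example structure are hypothetical.

```python
def pair_table(structure: str) -> dict[int, int]:
    """Map each paired index to its partner for a dot-bracket string."""
    stack: list[int] = []
    pairs: dict[int, int] = {}
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            j = stack.pop()
            pairs[i], pairs[j] = j, i
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r}")
    if stack:
        raise ValueError("unbalanced structure")
    return pairs


def loop_sizes(structure: str) -> list[tuple[str, int]]:
    """Return (kind, un-paired base count) for each loop, in 5' to 3' order."""
    pairs = pair_table(structure)
    loops = []
    for i in sorted(pairs):
        j = pairs[i]
        if i > j:
            continue  # visit each base pair once, from its 5' member
        k = i + 1
        while k < j and k not in pairs:
            k += 1
        if k == j:
            loops.append(("hairpin", j - i - 1))
            continue
        partner = pairs[k]
        if any(m in pairs for m in range(partner + 1, j)):
            continue  # multi-branch loop: outside the scope of this sketch
        left, right = k - i - 1, j - partner - 1
        if left + right == 0:
            continue  # stacked base pairs, no loop
        kind = "bulge" if 0 in (left, right) else "internal"
        loops.append((kind, left + right))
    return loops


# Hypothetical structure: a stem interrupted by an internal loop, capped by a hairpin.
print(loop_sizes("((..((.....))...))"))
```

Comparing the output for a wild-type and a mutant fold gives a direct readout of whether a loop has grown or shrunk.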

Variation in the immunogenicity, yield, stability or
pathogenicity of the product may readily be determined by
standard techniques by comparison to known strains such as
Cendehill. For example, mutation of Cendehill to increase
antigenicity may be determined by measuring increased
binding of a virus or viral protein to a known antibody to
rubella virus and comparing this binding to that of
Cendehill virus or protein at an equivalent concentration.

Arthrotropism, for the purpose of this specification,
is defined as the ability of a rubella virus strain to
replicate in pieces of human joint tissue weighing
approximately 0.1 gram cultured in 2 mls of medium and
yield virus of titres greater than 100 plaque-forming units
per ml of medium at 24 hours post-infection that increase
over the next 24 to 48 hours. Any virus less than
100 pfu/ml and that does not show an increase in titre
represents residual virus from the inoculum. Following a
period to allow adsorption of virus in the inoculum to the
cells (4 hours), the joint pieces are washed 4 to 5 times
to reduce this residual virus and characteristically
10-100 pfu/ml of virus remains after this procedure.
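The working definition of arthrotropism above amounts to a two-part test on a titre time course. The sketch below encodes one reading of it. It assumes titres in pfu/ml keyed by hours post-infection, with a 24 hour reading and at least one reading at 48 or 72 hours, and it takes "increases" to mean that some later titre exceeds the 24 hour titre. The function name and the sample values are hypothetical.

```python
def is_arthrotropic(titres_pfu_per_ml: dict[int, float], threshold: float = 100.0) -> bool:
    """Apply the >threshold at 24 h and rising-over-24-to-48-h criteria."""
    t24 = titres_pfu_per_ml[24]
    later = [titres_pfu_per_ml[h] for h in (48, 72) if h in titres_pfu_per_ml]
    if not later:
        raise ValueError("need a titre at 48 h or 72 h post-infection")
    return t24 > threshold and max(later) > t24


# Hypothetical time courses (hours post-infection -> pfu/ml).
growing = {24: 500.0, 48: 5000.0, 72: 20000.0}
residual = {24: 60.0, 48: 40.0}  # consistent with carry-over from the inoculum
print(is_arthrotropic(growing), is_arthrotropic(residual))
```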

This invention also provides a method for constructing
chimeric rubella viral strains comprising part Cendehill
and part of a second rubella strain including steps
whereby:

(a) cDNA for one or more of the Cendehill
non-translated regions, non-structural proteins p150 and
p90 and structural proteins C, E2 and E1 is joined to cDNA
of a rubella virus other than Cendehill to produce DNA
corresponding to a complete RNA genome of a chimeric
rubella virus. This may also be incorporated into a plasmid
or viral vector to provide a chimeric infectious clone.

(b) the resulting altered cDNA clone may be
transcribed to produce RNA which may be used to transfect
cells to produce chimeric virus, which can be cultivated as
a seed stock for vaccine production.

This invention also provides rubella cDNA, RNA, or
virus wherein cDNA or RNA encoding one or more of the viral
p150 or p90 proteins or the cDNA or RNA corresponding to a
5' non-translated region is derived from or is mutated to
correspond to Cendehill, and at least part of the DNA, RNA
or viral RNA is derived from or is mutated to correspond
to rubella other than Cendehill. Preferably, the cDNA, RNA
or genome of the virus will have one or more substitutions
or deletions (as compared with Therien strain) in or near
the 5' non-translated region in the area of nucleotides
17-65; substitutions in the non-structural gene coding
region resulting in one or more mutations of amino acids
929, 1006, 1041, 1162 of p150 protein or amino acid 1496 of
p90 protein; or, substitutions at or near nucleotides 118
or 358 of the non-structural gene encoding region.

This invention also provides the use of the
aforementioned cDNA, RNA, vectors (including infectious
clones) and viruses (recombinant or chimeric) in the
production of modified rubella cDNA, RNA or viruses,
production of modified rubella protein, and in the
production of rubella vaccines (DNA vaccines, live
attenuated viral vaccines and subunit vaccines).

This invention also provides the entire sequence of
the Cendehill strain of rubella virus, including the
identification of nucleotide substitutions relative to
wild-type strains which are unique to the Cendehill strain
and are associated with the attenuating phenotype. This
phenotype includes temperature sensitivity and the
restriction of growth in human joint tissue. These
substitutions can be incorporated into other rubella
strains such as the current RA27/3 vaccine to produce new
vaccine strains that are not arthritogenic. Such
substitutions may be in the region of nucleotides 17-65 (in
or near the first 5' non-translated region) which forms a
stem-loop structure. The substitutions may be at or near
nucleotides 118 or 358 of the non-structural gene region,
or the substitutions may involve one or more mutations of
amino acids 929, 1006, 1041, 1162 of p150 or amino acid
1496 of p90.



This invention also identifies mutations in Cendehill
virus structural gene regions associated with reduced
immunogenicity of this strain. These include two amino
acid substitutions in the E2 protein at amino acids 306 and
413 (ie. at nucleotides 7428 or 7746/47), and four amino
acid substitutions in E1 at amino acids 759, 785, 890 and
915 (ie. at nucleotides 8786/88; 8864; 9180; or 9254).
Alterations of some or all of these nucleotides to the
equivalent nucleotides found in a more immunogenic strain
such as RA27/3 or wild-type enables production of a
modified Cendehill strain which would be more antigenic.
This may also be used as an alternative vaccine.

The infectious clone of Cendehill strain exemplified
herein and identified as pJCND comprises a DNA copy of the
full-length Cendehill viral genome inserted into a vector
from which RNA transcripts of the genome can be synthesized
in vitro and which transcripts are infectious when
transfected into cells. In the case of pJCND, the vector
is the plasmid pCL1921, which was originally constructed
by Lerner and Inouye (1990) but modified by incorporation
of the pUC19 polycloning region (Yanisch-Perron et al.,
1985) and an SP6 RNA polymerase promoter. This plasmid is
replicated at low copy number (approximately 5 copies per
cell) and contains a spectinomycin resistance gene.
Transcription of pJCND or other infectious clones employing
Cendehill cDNA with a suitable polymerase (eg. SP6
polymerase for pJCND) enables the production of infectious
Cendehill RNA which can be transfected into cells to yield
a seed stock for obtaining recombinant rubella virus stocks
and rubella vaccines.

Methods for production of infectious clones,
subsequent expression of RNA, transfection of cells with
such RNA and production of virus as well as use of such
virus in the preparation of rubella vaccines are known, for
example as described in United States Patents 5,439,814 and
5,663,065 of Frey, et al. Suitable expression vectors for
rubella cDNA include those described herein as well as
others known in the art such as the pSI or pCI mammalian
expression systems (Promega) which incorporate the SV40 and
CMV Immediate Early enhancer/promoter systems
(respectively) or bacterial plasmids such as pUC18, pGEM or
pBR322 (Promega) incorporating a suitable promoter
sequence such as the SP6 promoter.

Methods for production of suitable expression vectors
for use in DNA vaccines are also known. For example, cDNA
derived from this invention may be expressed in pSI or pCI
described above or the vector could be a viral vector
modified to allow expression of foreign genes. Such
vectors derived from adenovirus, retrovirus, alphavirus, or
vaccinia virus are frequently modified to make them
non-pathogenic to the host. Such vectors expressing cDNA
derived from this invention may be used directly as a DNA
vaccine.

For preparation of chimeric strains according to this
invention, a preferred method is to synthesize cDNA from a
second rubella virus by preparing RNA from virus of the
second strain using established techniques and then
performing reverse transcription and PCR (polymerase chain
reaction) on the isolated RNA using primers which flank the
region of interest (for example, primers FI or 18 as
described herein for synthesis of the Cendehill/RA27/3
chimera). The cDNA is then subjected to restriction enzyme
digestion and resulting fragments are ligated into the
Cendehill infectious clone which has been similarly
digested to remove the same segment. Similarly, desirable
portions of the Cendehill cDNA (such as the non-translated
region, or non-structural genes) may be obtained by
ng fragment ligated into an
rubella strain which has been 



As exemplified herein, recombinant viruses were
derived from pJCND and Therien/Cendehill chimeras. These
strains were compared for their ability to grow in primary
human joint cells, enabling the identification of two
regions associated with growth restriction in these cells,
in the non-structural gene region. The identification of
these regions enables the production of further recombinant
virus strains which combine the phenotypic property of
joint growth restriction with the immunogenicity of other
rubella virus strains such as RA27/3, M33 or Therien.

Sequencing of pJCND enabled the identification of
nucleotide substitutions in Cendehill which are not present
in wild-type strains. The stem-loop region, which includes
a 5' non-translated region and extends into the
non-structural open reading frame (ORF), contributes to
joint growth restriction. This region has been shown to be
important in viral viability and virulence in some
α-viruses, including Sindbis virus and rubella virus
(Niesters & Strauss, 1990; Pogue et al., 1993; Pugachev &
Frey, 1998).

In the 3' subgenomic region, which includes the
structural gene region, Cendehill strain contains
67 substitutions relative to the Therien strain: three in
the non-translated region (NTR) upstream of the
translational start site of the subgenomic RNA, two in the
3' NTR, and the remainder in the coding region. Many of the
substitutions in the structural genes occur as the third
base of a codon and do not affect the amino-acid
composition, leaving 16 substitutions in the
1062 amino-acids comprising the structural genes (nine of
which are also found in the M33 strain). The substitutions
include two substitutions in the capsid protein, two in the
E2 glycoprotein and four in the E1 glycoprotein.
Modifications to the Cendehill structural genes (for
example, by site specific mutagenesis, linker-insertion
mutagenesis or homologous recombination) to provide a
strain with higher immunogenicity while retaining the
attenuating characteristics of Cendehill can therefore be
carried out.

Brief Description of the Drawings 

Figure 1 is a schematic showing the organization of
the rubella virus genome. The RNA is polyadenylated (An)
and both the genomic and sub-genomic species are capped
(CAP).

Figure 2 describes the oligonucleotide primers used
for reverse transcription of rubella virus RNA and
amplification of cDNA. Identification numbers for each
primer appear on the left. Viral genome positions
corresponding to nucleotide positions in Appendix 1 for
seven of the primers appear on the right.

Figure 3 is a schematic showing four Cendehill cDNA
fragments used to construct chimeric viruses and a
Cendehill infectious clone, beneath a general
representation of the viral genome. Restriction sites are
identified and the locations of sites used for construction
are indicated by the dotted lines. Primers used to generate
each cDNA fragment are indicated by primer identification
numbers (from Figure 2) at fragment termini.

Figure 4 is a schematic showing the modified
polycloning site of pCLPC, which is derived from pCL1921.

Figure 5 is a schematic of a cloning strategy for
production of Cendehill and Cendehill chimeric clones.
Cendehill double stranded (ds) cDNA fragments are cut using
the appropriate restriction enzymes and inserted
sequentially into similarly restricted regions of pROBO302.

Figure 6 is a schematic comparing pROBO302 to a
full-length Cendehill clone (pJCND) and two Cendehill
chimeras (pROC3 and pROC3M). Regions without
cross-hatching are Therien and cross-hatched regions are
Cendehill.

Figure 7 shows predicted 5' stem loop structures of
rubella RNAs generated by the Mfold™ 3.0 program using the
published default settings for linear RNA. Figures 7A,
7B and 7C are for Cendehill, wild-type and RA27/3,
respectively. The wild-type structure shown in Figure 7B
is the same for the Therien and M33 strains and also the
HPV77 vaccine.

Figure 8 is a schematic showing the non-structural
gene region and the position of amino acid substitutions in
the Cendehill strain relative to Therien. Bars indicate
mutations described by single letter amino acid codes.

Figure 9 is a schematic showing the structural genes,
glycosylation sites and the position of the amino acid
substitutions in the Cendehill strain as compared to
Therien, including those shared with M33 strain (unshaded
bars). Solid bars indicate mutations unique to Cendehill.

Detailed Description of Embodiments of the Invention

An infectious clone comprising a cDNA copy of all of 
the RNA of the Cendehill strain of rubella virus was 
produced as described below. 

Isolation of Viral RNA

Cendehill virions were obtained by pelleting
supernatant virus from the medium of Vero cells infected
with Cendehill virus (Rchn Pharma) for 4 hours at 18000 rpm
in a Sorvall™ centrifuge. Viral RNA was isolated by
extraction with acidified phenol/guanidinium isothiocyanate
using Trizol™ (Gibco/BRL) according to the manufacturer's
instructions. RNA was precipitated from the aqueous phase
by the addition of isopropyl alcohol (1:1) and washed with
75% ethanol diluted in DEPC-treated H2O prior to drying and
resuspension in DEPC-treated ddH2O.

Reverse Transcription 

Specific primers complementary to the published
sequence of the Therien strain were used to initiate the
first strand of DNA synthesis. The primers used were #16,
38 and 125 (Figure 2). For each reaction, the primer was
mixed with viral RNA in (total volume 11 µl) and heated
for 3 min at 90°C. RNA was then transcribed using 200 U of
Superscript II™ (Life Technologies). The standard reaction
mixture contained 10 mM dithiothreitol and 1 mM dNTPs. The
volume was brought to 100 µl by addition of TE buffer and
heated to 90°C to inactivate the reverse transcriptase.
Enzyme, primers and excess nucleotides were removed by
extraction of the mixture with phenol/chloroform/isoamyl
alcohol (25:24:1, by volume), followed by precipitation at
-20°C in 0.3 M sodium acetate and 66% ethanol.

Thermal Cycling Amplification 

After generation of the first strand of DNA by reverse
transcription, double stranded cDNA was made by thermal
cycling amplification with a Minicycler™ (MJ Research)
using the specific primers (described in Figure 2 according
to the scheme shown in Figure 3) and repeated cycles of
incubation with Deep Vent™ (NEB) thermostable polymerase
with 3'-5' proof-reading exonuclease activity. The
standard reaction mixture contained 400 µM dNTP, 2 mM MgSO4,
0.5 µM primer and 1 unit of polymerase. The products were
resuspended in for ligation into the plasmid vector,
pCLPC, a derivative of pCL1921 with the modified cloning
site shown in Figure 4.

Cloning 

Four cDNA fragments (as shown in Figure 3) amplified
in pCLPC (see Figure 4) were sequentially cloned into the
Therien infectious clone pROBO302 (Pugachev et al., 1997).
The cloning strategy is outlined in Figure 5. To confirm
insertion of the correct fragments, the sequence of each
clone was compared with that of pROBO302 and Cendehill cDNA
sequenced directly following reverse transcription and
amplification.

Two chimeric strains and a full-length Cendehill clone
were produced:

(i) pROC3 which contains nucleotides 5357 to 9762 of
Cendehill as shown in Figure 5 and Appendix 1 (including
the entire structural gene region) and nucleotides 1 to
5356 of the Therien strain (the majority of the
non-structural genes and 5' non-translated region);

(ii) pROC3M which contains nucleotides 2803 to 9762 of
Cendehill (see Appendix 1) and nucleotides 1 to 2802 of
Therien; and,

(iii) pJCND which contains the entire genomic sequence
of the Cendehill strain (see Appendix 1).

These are shown in Figure 6.

Screening of Constructs 






The constructs were screened by restriction enzyme
digestion to determine that the inserts were the correct
size and had the expected restriction pattern. Each clone
was also screened for infectivity as follows. Small-scale
plasmid preparations were carried out by standard
techniques. These preparations were linearised by
restriction digestion with EcoRI at the 3' terminus of the
viral sequence. Positive-polarity viral RNA was generated
by transcription from the SP6 promoter and the products
were transfected into BHK21 cells by electroporation.
After 2 days the supernatants were transferred to Vero
cells and supernatant virus was removed for plaque
titration 4 days later. The 3 constructs all gave titres
of progeny virus of 10^ - 10^/ml after three serial passages
in Vero cells. The progeny viruses were designated ROC3,
ROC3M and JCND.

Phenotypic Characterisation of the Recombinant Viruses 

Attenuating characteristics examined included
temperature sensitivity and replication in human joint
cells.

(1) Temperature sensitivity: At 39°C the Cendehill strain
is growth-restricted while wild-type strains grow normally.
This is believed to be an attenuating characteristic as
growth of Cendehill would be limited in infected patients
by even mild fever induction. None of the three recombinant
strains grew at 39°C, indicating that they have the
attenuating phenotype. Similarly, measurements of the
stability of the recombinant strains on prolonged
incubation at 37°C, relative to the Therien and Cendehill
parental strains, showed that the infectivity of the
recombinants and Cendehill decreased rapidly to 0.5% of the
input (a 200-fold reduction) in 50 hours while the
reduction in Therien was only 10-fold.
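
By way of illustration only, the relationship between the
fraction of input infectivity remaining and the fold
reduction quoted above can be checked with a few lines of
arithmetic. The function name below is hypothetical and is
not part of the original disclosure.

```python
def fold_reduction(fraction_remaining):
    """Return the fold reduction for a given fraction of input infectivity."""
    if not 0 < fraction_remaining <= 1:
        raise ValueError("fraction_remaining must be in (0, 1]")
    return 1.0 / fraction_remaining

# 0.5% of the input remaining (recombinants and Cendehill).
print(round(fold_reduction(0.005)))
# 10% of the input remaining, i.e. the 10-fold reduction seen with Therien.
print(round(fold_reduction(0.10)))
```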
(2) Growth in human joint cells: Mapping of the region of
the genome associated with joint cell restriction was
carried out by examining the ability of the recombinant
viruses to replicate after electroporation into human
synovial cells cultured according to the method of Miki and
Chantler (1993). The results showed that five days
following electroporation, the supernatant titre of pROC3
was the same as that for pROBO302 (the Therien clone). The
titre of electroporated pROC3M was 10-fold lower and no
growth was seen with pJCND on transfection of 0.5 µg of RNA
in each case (see Table I). Therefore the regions of the
Cendehill genome containing sequences involved in joint
cell restriction include nucleotides 2803 to 5355, which
are present in pROC3M but not pROC3, and the 5' end of the
genome, nucleotides 1 to 2803, which are specific to pJCND.

Table I

Rubella Virus Strain    Virus yield (pfu/ml)

Therien                 4.0 x 10^[illegible]
Cendehill               1.5 x 10^[illegible]
pROBO302                1.9 x 10^[illegible]
pROC3                   [illegible].5 x 10^[illegible]
pROC3M                  1.4 x 10^[illegible]
pJCND1                  no virus detected
pJCND2                  no virus detected
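
The mapping logic used above (each clone that adds a
further stretch of Cendehill sequence and shows a further
drop in joint-cell titre implicates the stretch that was
added) can be sketched as follows. This is a minimal
illustration only: the dictionary and function names are
hypothetical, and the boundaries are computed from the
construct definitions given under "Cloning", so they may
differ by a nucleotide from the ranges quoted in the text.

```python
GENOME_END = 9762  # last nucleotide of the genome as numbered in Appendix 1

# Cendehill-derived nucleotide range (inclusive) in each clone, taken from
# the construct descriptions; the rest of each clone is Therien sequence.
CENDEHILL_REGIONS = {
    "pROBO302": None,              # entirely Therien
    "pROC3": (5357, GENOME_END),
    "pROC3M": (2803, GENOME_END),
    "pJCND": (1, GENOME_END),
}

def added_region(smaller, larger):
    """Cendehill nucleotides present in `larger` but absent from `smaller`."""
    low_large, high_large = CENDEHILL_REGIONS[larger]
    small = CENDEHILL_REGIONS[smaller]
    high_new = high_large if small is None else small[0] - 1
    return (low_large, high_new)

# Each step that lowered the joint-cell titre implicates the region added.
for smaller, larger in [("pROC3", "pROC3M"), ("pROC3M", "pJCND")]:
    print(larger, "adds Cendehill nucleotides", added_region(smaller, larger))
```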



Sequence Analysis 

Further definition of the nucleotide substitutions
involved in attenuation was obtained by sequence
analysis. The entire cDNA sequence corresponding to the
Cendehill genome was determined using an automated
sequencing system at the NAPS unit at the University of
British Columbia employing Amplitaq Dye Terminator Cycle™
sequencing reagents (ABI) and by analysing the fluorescent
products spectrophotometrically. The sequence obtained is
shown in Appendix 1. It was compared with the published
sequences of Therien strain (Dominguez et al., 1990, later
corrected in Pugachev et al., 1997), a consensus M33
sequence (Clarke et al., 1987, Zheng et al., 1989 and
Pugachev, 1997) and the RA27/3 sequence (Pugachev et al.,
1997). Nucleotide substitutions specific to Cendehill
strain in the area of the first 5' NTR, the non-structural
and structural genes, and in the 3' NTR are described in
detail below, in which the nucleotide numbering is
according to the whole genome shown in Appendix 1 and the
amino acid numbering is according to the polyproteins as
described above.

5' Non-translated Region (NTR) and Stem-Loop Region

Two substitutions as shown in Table II were identified
in this area.

Table II

nucleotide 37 : U to C
nucleotide 55 : A to G

These substitutions are in a stem-loop region that is
believed to be important in controlling viral replication
and translation. Alterations in this region destabilize
the stem structure and may affect binding of cellular or
viral factors important in viral replication.

The stem loop structure may be predicted by computer
programs intended to generate representations of folded
structures. For the purposes of this specification, stem
loop structures are determined by use of the Mfold™ 3.0
program from Dr. Michael Zuker, Washington University
School of Medicine (see: M. Zuker, et al.; Algorithms and
Thermodynamics for RNA Secondary Structure Prediction: A
Practical Guide in RNA Biochemistry and Biotechnology,
J. Barciszewski & B.F.C. Clark eds., NATO ASI Series,
Kluwer Academic Publishers (1999); and D.H. Mathews, et al.
(1999) Expanded Sequence Dependence of Thermodynamic
Parameters Provides Robust Prediction of RNA Secondary
Structure, J. Mol. Biol. 288, 911-940). The Mfold 3.0
program may also be obtained on the Internet at:
http://mfold2.wustl.edu/~mfold/cgi-bin/nph-mfold-3.0cgi.
The Mfold program default settings are used, with the
input RNA sequence being designated as linear.
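
By way of illustration only, the general idea of computing
a folded structure from a linear RNA sequence can be
sketched with a much simpler method than the one used in
this specification. Mfold minimises predicted free energy;
the sketch below instead uses the Nussinov dynamic
programme, which merely maximises the number of nested base
pairs, and so is not a substitute for Mfold and will not
reproduce Figure 7. The function name and the test sequence
are hypothetical.

```python
# Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}

def max_base_pairs(rna, min_loop=3):
    """Nussinov dynamic programme: maximum number of nested base pairs,
    requiring at least `min_loop` unpaired bases inside any hairpin."""
    n = len(rna)
    if n == 0:
        return 0
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]              # base j left unpaired
            for k in range(i, j - min_loop):    # base j paired with base k
                if (rna[k], rna[j]) in PAIRS:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]

# A made-up 9-nucleotide hairpin: three stacked pairs closing a 3-base loop.
print(max_base_pairs("GGGAAAUCC"))
```

A single base change in the stem of such a toy hairpin
lowers the pair count, which is the same qualitative effect
(destabilisation of the stem) discussed for nucleotides 37
and 55.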

As shown in Figure 7A, the alteration at nucleotide 37
is in the terminal loop of the stem. With reference to
Figure 7, the terminal loop of Cendehill is altered as
compared to the predicted terminal loop of both wild-type
and RA27/3 strains. As is also shown in Figure 7, the
substitution at nucleotide 55 increases the size of the
bulge in the stem of Cendehill as compared to the bulges of
wild-type or RA27/3. As is shown in Figure 7, the medial
loop of Cendehill is altered as compared to the medial loop
which appears in both wild-type and RA27/3.

Attenuation of the wild-type rubella phenotype is
expected upon alterations in the nucleotide region 15-65,
particularly in the regions 20-28, 33-43 and 52-60.
An alteration which increases the size of the bulge such
that a bulge to one side of the stem has at least four
unpaired nucleotides (such as is shown in Figure 7A) is also
associated with the Cendehill phenotype.

Non-structural Gene (NSG) Region

Several mutations are found between nucleotides 2800
and 4550, including 5 mutations specific to the Cendehill
strain which are present in pROC3M but not in pROC3 and are
therefore associated with a significant restriction in
joint cell growth as described in Table I. These mutations
are delineated in Table III:

Table III

P150   nucleotide 2829   G to A   aa 929    cys - tyr
P150   nucleotide 3060   A to G   aa 1006   asp - gly
P150   nucleotide 3164   U to C   aa 1041   tyr - his
P150   nucleotide 3528   C to U   aa 1162   ala - val
P90    nucleotide 4530   C to U   aa 1496   thr - ile

Two of the NSG mutations lie within or in proximity to
a region of homology with the alphavirus NSP3 domain while
the other two are in the protease domain and on either side
of cys 1151 at the catalytic site. The p90 mutation is in
the helicase domain.
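
The nucleotide and amino-acid numbers in Table III can be
cross-checked against one another. The sketch below is
illustrative only. It assumes that codon 1 of the
non-structural polyprotein begins at nucleotide 44; this
offset is not stated in the text and is inferred here
because it is the value for which the legible rows of
Table III are mutually consistent. The function name is
hypothetical.

```python
ORF_OFFSET = 44  # ASSUMED first nucleotide of codon 1 (inferred, not from the text)

def codon_position(nucleotide, offset=ORF_OFFSET):
    """Map a genome nucleotide number to (amino-acid number, base within codon)."""
    index = nucleotide - offset
    if index < 0:
        raise ValueError("nucleotide lies upstream of the assumed offset")
    return index // 3 + 1, index % 3 + 1

for nt in (3060, 3164, 3528, 4530):
    aa, base = codon_position(nt)
    print(f"nucleotide {nt}: amino acid {aa}, base {base} of the codon")
```

Under that assumption each nucleotide falls in the codon
whose number is listed in Table III, and at a codon
position compatible with the stated amino-acid change (for
example, the second base for asp to gly).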

In addition to the foregoing, there are two mutations
in the NSG region shown in Table IV which do not alter the
encoded amino-acid but may influence infectivity due to
changes in RNA structure.

Table IV

- nucleotide 118 C to U
  (This substitution may be involved in
  stem-loop structures at the 5' end)

- nucleotide 358 U to C
  (This substitution is in the region of rubella
  RNA involved in binding to the capsid protein)



Structural Gene (SG) Region

The structural genes of rubella virus are produced
from a 3327 nucleotide subgenomic RNA as represented in
Figure 1. It consists of a short (78 nucleotide)
5' non-translated region (NTR), the structural genes which
are translated from a single open-reading frame (ORF) and
a short 3' NTR. Both the 3' and 5' NTRs are capable of
forming stem-loop structures, can bind host cell proteins
and are believed to be important in viral replication. In
the entire subgenomic RNA, 67 nucleotide substitutions were
identified in Cendehill strain when compared with the
Therien strain (see Appendix 1). Two are in the 5' NTR
upstream of the translational start site, two in the 3' NTR
and the remainder are in the coding region. Many of the
substitutions in the structural genes occur as the third
base of a codon and do not affect the amino-acid
composition, leaving 16 substitutions in the
1062 amino-acids comprising the structural genes, eight of
which are also found in the M33 strain. The remaining
8 amino acid substitutions are not found in the HPV77/DE5
or RA27/3 vaccine strains either. The nucleotide/amino
acid substitutions specific to the Cendehill strain (other
than the 5' NTR substitutions) are shown in Table V(a) - (d)
in which the amino acid numbering is according to the
polyprotein.
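
The distinction drawn above between third-base
substitutions that leave the amino acid unchanged and
substitutions that alter it can be illustrated with the
standard genetic code. The sketch below is illustrative
only: the function name is hypothetical and the example
codons are made up, since the wild-type codons themselves
are not listed in the text.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def classify(codon, position, new_base):
    """Return (old amino acid, new amino acid, 'silent' or 'coding').
    `position` is 1, 2 or 3 within the codon."""
    mutated = codon[:position - 1] + new_base + codon[position:]
    old, new = CODON_TABLE[codon], CODON_TABLE[mutated]
    return old, new, "silent" if old == new else "coding"

# Hypothetical third-base change in an alanine codon: no amino-acid change.
print(classify("GCC", 3, "U"))
# Hypothetical first-base U to C change in a serine codon: ser becomes pro.
print(classify("UCC", 1, "C"))
```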

Table V(a): Protein C Region

nucleotide  6611   U to C   aa 34   ser-pro
nucleotides 6770   A to G   aa 87   thr-gly
            6771   C to G

The substitution at aa34 occurs within a stretch of
28 amino-acids (28-56) believed to be important in binding
of protein C to viral RNA during encapsidation. A region
between amino-acids 64 and 97 has been shown to react with
a monoclonal antibody, indicating that this is an antigenic
region although not one of the reported major antigenic
sites.

Table V(b): Protein E2 Region

nucleotide  7428   C to U   aa 306   ala-val
nucleotides 7746   C to U   aa 413   thr-ile
            7747   G to U

The alanine to valine substitution at aa306 is a
conservative change but lies within the first 26 residues
of protein E2, a region which has been identified as a
neutralising domain. The two changes at nucleotides 7746
and 7747 result in the loss of an Asn-X-Thr glycosylation
site, one of four N-linked glycosylation sites found in
Therien strain. The literature is conflicting as to
whether the latter substitution is present in M33.
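
The loss of an N-linked glycosylation site through a
threonine to isoleucine change can be illustrated by
scanning a protein sequence for the N-linked sequon
(Asn, any residue other than Pro, then Ser or Thr). The
sketch below is illustrative only; the function name is
hypothetical and the peptides are made up rather than taken
from the E2 sequence.

```python
import re

# N-linked glycosylation sequon: Asn, any residue except Pro, then Ser or Thr.
# A lookahead is used so that overlapping sequons are all reported.
SEQUON = re.compile(r"(?=N[^P][ST])")

def sequon_positions(protein):
    """Return the 1-based position of the Asn in each N-X-S/T sequon."""
    return [match.start() + 1 for match in SEQUON.finditer(protein)]

wild = "GANGTLA"     # hypothetical peptide containing one Asn-Gly-Thr sequon
mutant = "GANGILA"   # the same peptide after a Thr to Ile change
print(sequon_positions(wild))
print(sequon_positions(mutant))
```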

Table V(c): Protein E1

nucleotides 8786   A to G             aa 759   asn-asp
            8788   C to U
nucleotide  8864   [illegible] to A   aa 785   leu-met
nucleotide  9180   A to U             aa 890   his-leu
nucleotide  9254   G to A             aa 915   ala-thr

The four alterations in E1 all occur in the region of
the protein which is extruded into the lumen of the
endoplasmic reticulum, and is therefore also exposed on the
surface of the mature virion. The first substitution at
amino-acid 759 alters an asparagine to an aspartic acid
residue with the resulting loss of an N-linked
glycosylation site, one of three in E1, all of which are
believed to be utilised. None of the substitutions in E1
are in regions identified as dominant epitopes of the
cell-mediated immune response, nor in regions identified by
monoclonal antibodies as being associated with
hemagglutination or neutralisation. However they may alter
conformation-dependent epitopes associated with the
humoral response, affecting the immunogenicity of Cendehill
strain, which reacts poorly with polyclonal antisera to the
Therien strain in immunoprecipitation and immunoblot
assays.

Table V(d): 3' NTR

nucleotide 9731   G to [illegible]
nucleotide 9740   C to U
nucleotide 9741   C to U

This region, like the 5' NTR, is involved in RNA
replication. Although the substitutions at nucleotides
9731 and 9740 are also found in the M33 strain, they may
affect attenuation as M33 is a less cytopathic strain than
Therien.

The substitutions identified in the structural genes
of Cendehill are responsible for the lower antigenicity
and immunogenicity of this strain relative to Therien, M33
or RA27/3. Using the Cendehill infectious clone,
alterations to the structural genes (for example, by
site-directed mutagenesis) would enable the antigenicity of
this strain to be repaired. This would provide a novel
rubella strain with the attenuating phenotype of Cendehill,
including restriction of growth in joint cells, but with
the immunogenic properties of either a wild strain like
Therien or the RA27/3 vaccine strain. Alternatively, a
chimeric strain can be produced comprising (for example)
the entire structural gene region of RA27/3 inserted into
the Cendehill infectious clone. Either of these constructs
would provide an improved attenuated rubella vaccine.

Production of Modified Rubella Virus Strains 

Altered strains can be produced by standard
recombinant DNA technology as described in many current
textbooks including "Molecular Cloning: A Laboratory
Manual," edited by Maniatis, T., Fritsch, E.F., and
Sambrook, J. (Cold Spring Harbor Laboratory, Cold Spring
Harbor, N.Y. 1989) or "Current Protocols in Molecular
Biology" edited by Ausubel et al. (Wiley Interscience,
1987).

To alter specific nucleotides in the structural gene
region, oligonucleotide-directed mutagenesis and gene
amplification technology can be used as described by
Higuchi (1989). This procedure involves synthesis of an
oligonucleotide specific for the region to be modified,
containing the required nucleotide substitution, as well as
an appropriate restriction site. This can then be used as
one primer for a gene amplification reaction encompassing
the region of interest. A second primer is chosen which
includes a unique restriction site and which will yield a
fragment of suitable size. Following amplification of the
fragment which now has the requisite nucleotide
substitution incorporated, the fragment is cloned into the
infectious clone replacing the original sequence. In this
way, mutations can be incorporated into the gene sequence
either singly or sequentially until the resulting virus has
the desired properties.



Production of Chimeric Virus Strains

A cDNA clone including the entire structural gene
region of a rubella strain such as RA27/3 can be made in the
following steps: (i) isolation of viral RNA from
high-titre virus stock, (ii) first strand cDNA synthesis
using a specific primer for the 3' end, (iii) amplification
of the structural gene region using primers F1 and 18
(Figure 2), (iv) digestion of the amplified fragment and
also pJCND with BglII and EcoRI, and (v) cloning of the
amplified fragment into pJCND (previously separated from
its digested insert).
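
Before step (iv) it can be useful to confirm that an
amplified fragment carries each of the two restriction
sites where expected. The sketch below is illustrative
only: it locates the BglII (AGATCT) and EcoRI (GAATTC)
recognition sequences in a DNA string. The function name is
hypothetical and the test fragment is a made-up sequence,
not part of any rubella strain.

```python
# Recognition sequences of the two enzymes named in step (iv).
SITES = {"BglII": "AGATCT", "EcoRI": "GAATTC"}

def find_sites(dna, sites=SITES):
    """Return {enzyme: [1-based start positions]} for each recognition site."""
    dna = dna.upper()
    hits = {}
    for name, motif in sites.items():
        positions = []
        start = dna.find(motif)
        while start != -1:
            positions.append(start + 1)
            start = dna.find(motif, start + 1)
        hits[name] = positions
    return hits

fragment = "ccAGATCTggatccaaGAATTCtt"  # made-up test sequence
print(find_sites(fragment))
```

A fragment suitable for directional cloning as described
would be expected to show exactly one position for each
enzyme.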

Following the above-described scheme, a chimeric
Cendehill/RA27/3 clone whose genome includes a first
portion which is equivalent to the Cendehill 5'
non-translated RNA, Cendehill p150 and p90 and a second
portion equivalent to the structural gene region and the 3'
non-translated region of RA27/3 strain was made. This
clone can be used to produce a chimeric virus that
expresses the structural proteins of RA27/3 but has the
determinants of arthrotropism found in the genetic
structure at the 5' end and in the non-structural genes of
Cendehill strain.



This construct was produced by synthesising a cDNA/PCR
fragment, using RA27/3 RNA as template, equivalent to the
18-F1 fragment shown in Figure 2. This fragment was then
inserted into the Cendehill infectious clone using the
restriction enzymes BglII and EcoRI, in an identical manner
to the synthesis of pROC3 described elsewhere in this
specification. The new chimeric clone was sequenced
through nucleotides 6611 and 6770/6771 as well as through
nucleotides 8786/8788 and 8864 to ensure that replacement
of the 18-F1 fragment had occurred. The published sequence
of RA27/3 indicates that the latter strain has the same
nucleotides as Therien strain at these positions (Pugachev
KV, Abernathy ES and Frey TK. Archives of Virology 142:
1165-1180, 1997: Genomic sequence of the RA27/3 vaccine
strain of rubella virus) while Cendehill is modified in
these regions as disclosed herein.



Screening of Novel Rubella Strains

Modified cDNA clones incorporated in the pCL1921
plasmid can be transcribed into complete infectious RNA
from the SP6 promoter. The RNA produced can be transfected
into BHK-21 cells by a variety of techniques including
electroporation or use of Lipofectamine™ (Gibco/BRL). The
transfected RNA is translated and replicated in the cell to
yield virus with altered phenotypic properties according to
the mutations introduced. In this way, seed stocks of
rubella strains of this invention may be produced.

Phenotypic properties of rubella strains of this
invention can be monitored for characteristics associated
with attenuation and immunogenicity. For example, yield,
temperature sensitivity and the ability to grow in human
joint tissue can be determined as described previously for
pROC3 and pROC3M. The antigenicity of the strains can be
assessed using standard enzyme-linked immunosorbent assays,
immunoprecipitation assays and immunoblots with human
rubella seropositive antisera. The efficacy of a strain
for eliciting a strong neutralising antibody response can
be measured in rabbits and compared with the current
vaccine strain, RA27/3, and also the parental Cendehill
strain. In this way, novel strains can be assessed for
characteristics that would make them suitable for use as
improved attenuated vaccines.

Attenuated rubella strains may be used as a seed stock
for manufacturing vaccine. Virus from such a stock may be
combined with a variety of stabilisers such as saline,
phosphate buffer, polyethylene glycol or glycerin, as
currently used in vaccine preparations. The vaccine may be
produced in lyophilised form to aid long-term preservation.
It can also be combined with other vaccines such as mumps
and measles vaccines as in the current M-M-R formulation.



In addition to use of rubella virus strains of this
invention as live attenuated vaccines as described above,
modified infectious cDNA clones may also be used to produce
a DNA vaccine against rubella virus, either singly or in
combination with other DNA vaccines. For this, the cDNA of
the rubella virus strain is sub-cloned into an expression
vector (either plasmid or viral) which contains a suitable
eukaryotic promoter. Either the entire rubella virus
genome, the structural genes or immunogenic regions of the
structural genes can be used in this manner to directly
immunise patients. The DNA vaccine is taken up by cells
and transcribed from the eukaryotic promoter to yield RNA
which is translated into viral proteins. These in turn
elicit an immune response.

Other uses of the Cendehill infectious clone and its
derivatives include the production of large quantities of
virus for use as antigen in enzyme-linked immunosorbent
assays to assess human antibody levels against rubella. In
view of variations in the antigenicity of the different
rubella virus strains, it would be preferable to use
antigen known to react optimally according to the vaccine
strain delivered. For example, a virus strain with the
structural gene region identical to the vaccine in use, but
altered in the non-structural genes or NTR regions to
improve viral yield for antigen production may be
propagated. Subsequently, the strain for use in
immunoassays would be treated to produce a non-infectious
antigen preparation. Alternatively, the structural
proteins alone could be produced from a suitable expression
vector to yield an antigen preparation with the correct
specificity.

Although various aspects of the present invention have
been described in detail, it will be apparent that changes
and modifications to those aspects described herein will
fall within the scope of the appended claims. All
publications and references referred to herein are hereby
incorporated by reference.

APPENDIX 1
Sequence of Cendehill virus cDNA

5' NTR | p150

1 CAATCGAAGC 7ATCGGACCT CGCTTAGGAC TCCTATCCCC |a?G GAG AAA 

50 CTC CTG GAT GAG GTT CTT GCC CCC GGT GGG OCT TAT AAC TTA ACC GTC GGC 

ICl AGT I^GG GTA AGA GAG CAT GTC CGC TCA ATT GTC GAG GGC GCG TGG GAA GTG 

152 CGC GAT GTT GTT ACC GCT GCC CAA AAG CGC GCC ATC GTA GCC GTG ATA 

2 00 CCC AGA CCT GTG TTC ACG GAG ATG CAG GTC AGT GAT CAC CCA GCA CTC CAC 

251 GCA ATT TCG CGG TAT ACC CGC CGC CAT TGG ATC GAG TGG GGC CCT AAA GAA 

30: GCC CTA CAC GTC CTC ATC GAC CCA AGC CCG GGC CTG CTC CGC GAG GTC 

350 GCT CGC GTC GAG CGC CGC TGG GTC GCA CTG TGC CTC CAC AGG ACG GCA CGC 

401 AAA CTC GCC ACC GCC CTG GCC GAG ACG GCC AGC GAG GCG TGG CAC GCT GAC 

451 TAG GTG TGC GCG CTG CGT GGC GCA CCG AGC GGC CCC TTC TAG GTC CAC 

502 CCC GAG GAC GTC CCG CAC GGC GGT CGC GCC GTG GCG GAC AGA TGC TTG CTC 

551 TAG TAG AC A CCC ATG CAG ATG TGC GAG CTG ATG CGC ACC ATT GAC GCC ACC 

60 2 TTG CTC GTG GCG GTT GAC TTG TGG CCG GTC GCC CTT GCG GCC CAC GTC 

65 c GGC GAT GAC TGG GAC GAC CTG GGC ATT GCC TGG CAT CTC GAC CAT GAC GGC 

701 GGT TGC CCC GCC GAT TGT CGT GGA GCC GGC GCT GGG CCC ACG CCC GGC TAG 

7 52 ACC CGC CCC TGC ACC ACA CGC ATC TAC CAA GTC CTG CCG GAC ACC GCC 

800 CAC CCC GGG CGC CTC TAC CGG TGC GGG CCC CGC CTG TGG ACG CGC GAT TGC 

851 GCC GTG GCC GAA CTC TCA TGG GAG GTT GCC CAA CAC TGC GGG CAC CAG GCG 

902 CGC GTG CGC GCC GTG CGA TGC ACC CTC CCT ATC CGC CAC GTG CGC AGC 

950 CTC CAA CCC AGC GCG CGG GTC CGA CTC CCG GAC CTC GTC CAT CTC GCC GAA 

1001 GTG GGC CGG TGG CGG TGG TTC AGC CTC CCC CGC CCC GTG TTC CAG CGC ATG 

105 2 CTG TCC TAC TGC AAG ACC CTG AGC CCC GAC GCG TAC TAC AGC GAG CGC 

1100 GTG TTC AAG TTC AAG AAC GCC CTG AGC CAC AGC ATC ACG CTC GCG GGC AAT 

1151 GTG CTG CAA GAG GGG TGG AAG GGC ACG TGC GCC GAA GAA GAC GCG CTG TGC 

1202 GCA TAC GTA GCC TTC CGC GCG TGG CAG TCT AAC GCC AGG TTG GCG GGG ATT 

1253 ATG AAA GGC GCG AAG CGC TGC GCC GCC GAC TCT TTG AGC GTG GCC GGC TGG 

1304 CTG GAC ACC ATT TGG GAC GCC ATT AAG CGG TTC TTC GGT AGC GTG CCC CTC 

1355 GCC GAG CGC ATG GAG GAG TGG GAA CAG GAC GCC GCG GTC GCC GCC TTC 

140 3 GAC CGC GGC CCC CTC GAG GAC GGC GGG CGC CAC TTG GAC ACC GTG CAA CCC 

1454 CCA AAA TCG CCG CCC CGC CCT GAG ATC GCC GCG ACC TGG ATC GTC CAC GCA 

1505 GCC AGO GCA GAC CGC CAT TGC GCG TGC GCT CCC CGC TGC GAC GTC CCG 

1553 CGC GAA CGT CCT TCC GCG CCT GCC GGC CCG CCG GAT GAC GAG GCG CTC ATC 

1604 CCG CCG TGG CTG TTC GCC GAG CGC CGT GCC CTC CGC TGC CGC GAG TGG GAT 

1655 TTC GAG GCT CTC CGC GCG CGC GCC GAT ACG GCG GCC GCG CCC GCC CCG 

1703 CTG GCT CCA CGC CCT GCG CGG TAG CCC ACC GTG CTC TAC CGC CAC CCC GCC 

1754 CAC CAC GGT CCG TGG CTC ACC CTT GAC GAG CCG GGC GAG GCT GAC GCG GCC 

1805 CTG GTC TTA TGC GAC CCA TTT GGC CAG CCG CTC CGG GGC CCT GAA CGC 

1853 CAC TTC GCC GCC GGC GCG CAT AT3 TGC GCG CAG GCG CGG GGG CTC CAG GCT 

1904 TTT GTC CGT GTC GTG CCT CCA CCC GAG CGC CCC TGG GCT GAC GGG GGC GCC 

1955 AGA GCG TGG GCG AAG TTC TTC CGC GGC TGC GCC TGG GCG CAG CGC TTG 

2003 CTC GGC GAG CCG GCA GTC ATG CAC CTC CCA TAC ACC GAT GGC GAC GTG CCA 

2054 CAG CTG ATC GCA CTG GCC TTG CGC ACG CTG GCC CAA CAG GGG GCC GCC TTG 

2105 GCA CTC TCG GTG CGT GAC CTG CCC GGG GGT GCA GCG TTC GAC GCA AAT 

2153 GCG GTC ACC GCC GCC GTG CGC GCT GGC CCC GGC CAG CTC GCG GCC ACG TCA 

2204 CCG CCA CCC GGC GAC CCC CCG CCG CCG CGC CGC GCA CGG CGA TCG CAA CGG 

2255 CAC TCG GAC GCC CGC GGC ACT CCG CCC CCC GCG CCT GTG CGC GAC CCG 

2303 CCG CCG CCC GCC CCC AGC CCG CCC GCG CCA CCC CGC GCG GGT GAC CCG GTC 

2354 CCT CCC ACT CCC GCG GAG CCG GCG GAT CGC GCG CGT GAC GCC GAG CTG GAG 

2405 GTC GCC TAC GAA CCG AGC GGC CCC CCC ACG TCA ACC AAG GCA GAC CCG 

2453 GAC AGC GAC ATC GTT GAA AGT TAC GCC CGC GCC GCC GGA CCT GTG CAC CTC 

2504 CGA GTC CGC GAC ATC ATG GAC CCA CCG CCT GGC TGC AAG GTT GTG GTC AAC 

2555 GCC GCC AAC GAG GGG CTG CTG GCC GGC TCC GGC GTG TGC GGT GCC ATC 

2603 TTT GCC AAC GCC ACG GCG GCC CTC GCT GCA GAC TGC CGG CGC CTC GCC CCA 

2654 TGC CCC ACC GGC GAG GCG GTG GCG ACA CCC GGC CAC GGC TGC GGG TAC ACC 

2705 CAC ATC ATC CAC GCC GTC GCG CCG CGG CGT CCT CGG GAC CCC GCC GCC 

2753 CTC GAG GAG GGC GAA GCG CTG CTC GAG CGC GCC TAC CGC AGC ATC GTC GCG 

2804 CTA GCC GCC GCG CGT CGG TGG GCG TAT GTC GCG TGC CCC CTC CTC GGC GCT 

2855 GGC GTC TAC GGC TGG TCT GCT GCG GAG TCC CTT CGA GCC GCG CTC GCG 

2903 GCT ACG CGC GCC GAG CCC GTC GAG CGC GTG AGC CTG CAC ATC TGC CAC CCC 

2954 GAC CGC GCC ACG CTG ACG CAC GCC TCC GTG CTC GTC GGC GCG GGG CTC GCT 

3005 GCC AGG CGC GTC AGT CCT CCT CCG ACC GAG CCC CTC GCA TCT TGC CCC 

3053 GCC GGT GGC CCG GGC CGA CCG GCT CAG CGC AGC GCG TCG CCC CCA GCG ACC 

3104 CCC CTT GGG GAT GCC ACC GCG CCC GAG CCC CGC GGA TGC CAG GGG TGC GAA 
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GCG 


CGG 


5357 


ATC 


TTT 


GCC 


GGC 


ATG 


5408 


AAA 


GCC 


ACC 


TTG 


AAG 


5456 


GAG 


GAC 


TGC 


CAC 


GCC 


5507 


GCC 


AAG 


GAG 


TGG 


GTC 


5558 


ATT 


ATC 


ATG 


CGC 


GCC 


^ t A £ 
D D U D 








GAG 




5657 


ATC 


GAG 


GTC 


GAC 


TTC 


5706 


GAC 


GTC 


GAG 


CTC 


GAG 


5756 


GAA 


GAC 


TAC 


CGC 


GCG 


5807 


GGC 


TCC 


ACT 


GAG 


ACC 


5858 


CTG 


CAC 


AAC 


ACC 


ACC 


5906 


AAA 


GGC 


GTG 


CGC 


TGG 


5957 


CTC 


CCC 


GAG 


GGC 


GCG 


6006 


GGC 


TTG 


TTC 


GGC 


TTC 


6056 


CCC 


AGC 


TTC 


TGC 


GGG 


6107 


ATG 


CAC 


CAG 


GCA 


ATC 


6156 


GAA 


GAA 


CAG 


CAG 


GTG 


6206 


GCT 


CTG 


CCT 


GAC 


ACC 


6257 


GAG 


CGC 


GTC 


CTC 


GCT 


6306 


GGC 


CTC 


GAC 


CAC 


CCG 


6356 


CCC 


TAC 


GCG 


CGC 


GCC 



GAC 


GCT 


GGG 


GCA 


CTG 


GCG 


CGC 


GTT 


GTC 


GCC 


GTC 


GAG 


ATC 


CCC 


GAG 


GCC 


CAA 


GAC 


CTC 


GTC 


TTC 


GGC 


CGT 


GCC 


GTG 


ACT 


GAG 


GGC 


GAA 


CGA 


CTC 


AAC 


AAG 


AAT 


CAC 


ACC 


GTT 


TGC 


GCC 


GTG 


CGG 


CGC 


ACT 


GCT 


GTG 


GCC 


CGC 


CAG 


CGC 


GTC 


ACT 


GCT 


GGG 


GTC 


CGG 


ATC 


GAC 


CTC 


ACT 


GAT 


GAG 


CTC 


ACC 


GAC 


CGC 


TAC 


TGT 


ACC 


GCC 


CAG 


AGC 


CTG 


TGC 


GTA 


GAC 


GCC 


GCC 


CTC 


GCT 


CAG 


GGG 


AAA 


GCC 


GGC 


CAG 


GTT 


ATG 


TCC 


CCG 


CAT 


TTG 


CGC 


CCG 


CAA 


TTC 


CTT 


GAT 


GCG 


TGG 


TGG 






ACT 


GAG 


TTC 


GAC 


ATG 


AAC 


ATT 


AGC 


GCC 


GCT 


CTC 


TTG 


CTC 


CGC 


GCC 


GGC 


AGC 


TAT 


GGC 


TGC 


GAG 


CGC 


ACA 


AGC 


GTG 


GCC 


ATG 


TGC 


ATG 


GCT 


GCC 


GGG 


ATT 


TTC 


CAG 


GGC 


CGC 


AGT 


GCG 


GCA 


CTC 


AAG 


CAC 


ATT 


CCA 


GTG 


AAG 


CAT 


CAC 


GTC 


GGC 


ACC 


GCG 


GCC 


AAG 


GTG 


CTT 


TGC 


CGC 


CGT 


GCC 


CTC 


CTC 


GAC 


CGC 


CTC 


GTT 


GCC 


GCC 


AAT 


GCT 


GCG 


ATC 


GTG 


CGC 


GAA 


CTT 


ACC 


GCC 


ACC 


ATC 


GGC 


GCG 


CTC 


AAT 


CTC 


CAC 


GAC 


GCT 


GAC 



GAA 
CAG 


CTC 
GCA 


AAG 
CCA 


GAG 
CCA 


GTT 

CCG 


TTG 


GTG 


CCG 


CCC 


TTC 


TGC 


CCC 


GGC 


CAC 


CCC 


CAT 


TAC 




GAA 


GTG 


CGG 


TAT 


ATG 


CGC 


GAG 


ATG 


CCC 


GGA 


ACG 


GAA 


TAC 
CAC 


CGC 
CCG 


GCG 
CGC 


GGC 
CCT 


GAG 
TTC 


GAT 
CGC 


GCC 


CAG 


GAG 


TGG 


CGC 


ATG 


GTC 


TAC 


ACG 


CAG 


ATG 




GCG 
AGC 
GGC 


CGC 
GTC 
CCC 


CGC 
CCC 
AGG 


TAT 

GCC 
GAC 


CCT 
TTC 
ACC 


GAG 

CTC 


CTT 


GAG 


ATT 


CGG 


GCA 


TGG 


TTC 


CGC 


GCG 


ATC 


CAG 


AAG 


GTG 


GCC 


GCT 


GGC 


CAT 




PZiT 
u/\ 1 

CAG 


T&P 

ACC 


R pp 
CTC 


APp 
GCT 


ACT 


CiCC 
CGG 


GGC 


CTC 


CCT 


TGC 


GCC 




TGC 
GGC 
ATG 
GAC 


ACT 
GAG 
CGC 
GAT 


CTG 
CCT 
ATG 
ATG 


CGC 
GCC 
GTC 
GTC 


GAA 
ACG 
CCC 
ATC 


CTG 
CTG 

TTC 


TGG 


ACC 


CCC 


GCC 


GAG 


GTG 


GTG 


AGC 


ACC 


CCA 


ACC 




GGC 


CTC 


TTC 


CAT 


GAT 


GTC 


TTC 


GAC 


CCC 


GAC 


GTG 


CTT 


CGG 


GGG 


GTC 


TAC 


GCG 




TAC 


TAT 


GAC 


TAC 


AGC 


GCG 


GCG 


TAC 


GCG 


CGG 


GGG 


CGC 


GAG 


GAG 


ATT 


CAG 


ACC 




TAA 


|CGC CCC 


CGT 


ACG 


TGG 
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f— ► Bubgenome (NTK) 

64C-^ GGC CTT TAA TCT CAC CTA CTC TAA CCA [GGTCkTCkCC ChCCGTVGTT 

6451 TCGCCGCATC TGOTGGGTAC CCCACTCTTG CCATTCGGGA GAGCCCCAGC GTGCCCGA 



6 5C: ATG GOT TCC ACT ACC CCC ATC ACC ATG GAG GAC CTT GAG AAG GCC 



6555 


CTC 


GAG 


GCA 


CAA 


TCC 


CGC 


GCC 


CTG 


CGC 


GCG 


GAA 


CTC 


GCC 


GCC 


GGC 


GCC 


TCG 


6606 


CAG 


CCG 


CGC 


CGG 


CCG 


CGG 


CCG 


CCG 


CGA 


CAG 


CGC 


GAC 


TCC 


AGC 


ACC 


TCC 




6656 


GGA 


GAT 


GAC 


TCC 


GGC 


CGT 


GAC 


TCC 


GGA 


GGG 


CCC 


CGC 


CGC 


CGC 


CGC 


GGC 


AAC 


6707 


CGG 


GGC 


CGT 


GGC 


CAG 


CGC 


AAG 


GAC 


TGG 


TCC 


AGG 


GCC 


CCG 


CCC 


CCC 


CCG 


GAA 


6756 


GAG 


CGG 


CAA 


GAA 


GGT 


CGC 


TCC 


ChK 


ACT 


CCG 


GCC 


CCG 


AAG 


CCA 


TCG 


CGG 




6806 


GCG 


CCG 


CCA 


CAA 


CAG 


CCT 


CAA 


CCC 


CCG 


CGC 


ATG 


CAA 


ACC 


GGG 


CGT 


GGG 


GGT 


6857 


TCT 






LG<- 


i 


















TTC 


A(j 


A 




6896 


GTG 


GCG 


CGT 


GGC 


CTC 


CGC 


CCG 


CCT 


CTC 


CAT 


GAC 


CCT 


GAT 


ACC 


GAG 


GCA 




6956 


CCC 


ACC 


GAG 


GCC 


TGC 


GTG 


ACC 


TCA 


TGG 


CTT 


TGG 


AGC 


GAG 


GGC 


GAA 


GGC 


GCG 


7007 


GTC 


TTC 


TAG 


CGC 


GTC 


GAC 


CTG 


CAT 


TTC 


ACC 


AAC 


CTG 


GGC 


ACC 


CCC 


CCA 


CTC 


7058 


GAC 


GAG 


GAC 


GGC 


CGC 


TGG 


GAC 


CCT 


GCG 


CTC 


ATG 


TAG 


AAC 


CCT 


TGC 


GGG 




7106 


CCT 


GAG 


CCG 


CCT 


GCT 


CAC 


GTC 


GTC 


CGC 


GCG 


TAG 


AAC 


CAA 


CCT 


GCC 


GGC 


GAC 


7157 


GTC 


AGG 


GGC 


GTT 


TGG 


GGT 


AAA 


GGC 


GAG 


CGC 


ACC 


TAG 


GCC 


GAG 


CAG 


GAT 


TTC 


7208 


CGC 


GTC 


GGC 


GGC 


ACG 


CGC 


TGG 


CAC 


CGC 


CTG 


CTG 


CGC 


ATG 


CCA 


GTG 


CGC 




7256 


GGC 


CTC 


GAC 


GGC 


GAC 


AGC 


GCC 


CCG 


CTT 


CCC 


CCC 


CAC 


ACC 


ACC 


GAG 


CGC 


ATT 


7307 


GAG 


ACC 


CGC 


TCG 


GCG 


CGC 


CAT 


CCT 


TGG 


CGC 


ATC 


CGC 


TTC 


GGT 


GCC 


CCC 


CAG 


7356 


GCC 


TTC 


CTC 


GCC 


GGG 


CTC 


TTG 


CTC 


GCC 


GCG 


GTC 


GCC 


GTT 


GGC 


ACC 


GCG 





7406 CGC GCC GGG CTC CAG CCC CGC GTT GAT ATG GCG GCA CCC CCT ACG CCG CCG 

7457 CAG CCC CCC CGT GCG CAC GGG CAG CAT TAG GGT CAC CAC CAC CAT CAG CTG 

7506 CCG TTC CTC GGG CAC GAC GGC CAT CAC GGC GGC ACC TTG CGC GTC GGC 

7556 CAG CAT CAC CGA AAC GCC AGC GAC GTG CTG CCC GGC CAC TGG CTC CAA GGC 

7607 GGC TGG GGT TGC TAG AAC CTG AGC GAC TGG CAC CAG GGC ACT CAT GTC TGT 

7658 CAC ACC AAG CAC ATG GAC TTC TGG TGT GTG GAG CAC GAC CGA CCG CCG 

7706 CCC GCG ACC CCG ACG CCT CTC ACC ACC GCG GCG AAC TCC ATT ACC GCC GCC 

7757 ACC CCC GCC ACT GCG CCG GCC CCC TGC CAC GCC GGC CTC AAT GAC AGC TGC 

7808 GGC GGC TTC TTG TCT GGG TGC GGG CCG ATG CGC CTG CGC CAC GGC GCT 

7856 GAC ACC CGG TGC GGT CGG TTG ATC TGC GGG CTG TCC ACC ACC GCC CAG TAG 

7907 CCG CCT ACC CGG TTT GGC TGC GCT ATG CGG TGG GGC CTC CCC CCC TGG GAA 
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7956 CTG GTC GTT CTT ACC GCC CGC CCC GAA GAC GGC TGG ACT TGC CGC GGC 

8006 GTG CCC GCC CAT CCA GGT ACC CGC TGC CCC GAA CTG GTG AGC CCC ATG GGA 

8057 CGC GCG ACT TGC TCC CCA GCC TCG GCC CTC TGG CTC GCC ACA GCG AAC GCG 

8108 CTG TCT CTT GAC CAC GCG CTC GCG GCC TTT GTC CTG CTG GTC CCG TGG 

8156 GTC CTG ATA TTT ATG GTG TGC CGC CGC GCC TGT CGC CGC CGC GGC GCC GCC 

8207 GCC GCC CTC ACC GCA GTC GTC CTG CAG GGG TAC AAC CCC CCC GCC TAT GGC 

8258 [GAG GAG GCT TTC ACC TAC CTC TGC ACT GCA CCG GGG TGC GCC ACT CAA 

8306 ACA CCT GTC CCC GTG CGC CTC GCT GGC GTC CGC TTT GAG TCC AAG ATC GTG 

8357 GAC GGC GGC TGC TTT GCC CCA TGG GAC CTC GAG GCC ACT GGA GCC TGC ATC 

8408 TGC GAG ATC CCC ACT GAT GTC TCG TGC GAG GGC TTG GGG GCC TGG GTA 

8456 CCC ACA GCC CCT TGC GCG CGC ATC TGG AAT GGC ACA CAG CGC GCG TGC ACC 

8507 TTC TGG GCT GTC AAC GCC TAC TCC TCT GGC GGG TAC GCG CAG CTG GCC TCT 

8558 TAC TTC AAC CCT GGC GGC AGC TAC TAC AAG CAG TAC CAC CCC ACC GCG 

8606 TGC GAG GTT GAA CCT GCC TTC GGA CAC AGC GAC GCG GCC TGC TGG GGC TTC 

8657 CCC ACC GAC ACC GTG ATG AGC GTG TTC GCC CTT GCT AGC TAC GTC CAG CAC 

8708 CCT CAC AAG ACC GTC CGC GTC AAG TTT CAT ACA GAG ACT AGG ACC GTC 

8756 TGG CAA CTC TCC GTA GCC GGC GTG TCG TGC GAT GTC ACC ACT GAA CAC CCG 

8807 TTC TGC AAC ACG CCG CAC GGA CAA CTC GAG GTC CAG GTC CCG CCC GAC CCT 

8858 GGG GAC ATG GTT GAG TAC ATT ATG AAT TAC ACC GGC AAT CAA CAG TCC 

8906 CGG TGG GGC CTC GGG AGC CCG AAC TGT CAT GGC CCC GAT TGG GCC TCC CCG 

8957 GTT TGC CAA CGC CAT TCC CCT GAC TGC TCG CGG CTT GTG GGG GCC ACG CCA 

9008 GAG CGT CCC CGG CTG CGC CTG GTC GAC GCC GAC GAC CCC CTG CTG CGC 

9056 ACT GCC CCT GGG CCC GGC GAG GTG TGG GTC ACG CCT GTC ATA GGC TCT CAG 

9107 GCG CGC AAG TGC GGA CTC CAC ATA CGC GCT GGA CCG TAC GGC CAT GCT ACC 

9158 GTC GAA ATG CCC GAG TGG ATC CTC GCC CAC ACC ACT AGC GAC CCC TGG 

9206 CAC CCA CCG GGC CCC TTG GGG CTG AAG TTC AAG ACA GTT CGC CCG GTG ACC 

9257 CTG CCA CGC GCG TTA GCG CCA CCC CGC AAT GTG CGT GTG ACC GGT TGC TAC 

9308 CAG TGC GGT ACC CCC GCG CTG GTG GAA GGC CTT GCC CCA GGG GGA GGG 

9356 AAC TGC CAT CTC ACC GTC AAT GGC GAG GAC GTC GGC GCC TTC CCC CCT GGG 

9407 AAG TTC GTC ACC GCC GCC CTC CTC AAC ACC CCC CCG CCC TAC CAA GTC AGC 

9458 TGC GGG GGC GAG AGC GAT CGC GCG AGC GCG CGG GTC ATT GAC CCC GCC 

9506 GCG CAA TCG TTT ACC GGC GTG GTG TAT GGC ACA CAC ACC ACT GCT GTG TCG 

9557 GAG ACC CGG CAG ACC TGG GCG GAG TGG GCT GCT GCT CAT TGG TGG CAG CTC 

9608 ACT CTG GGC GCC ATT TGC GCC CTC CCA CTC GCT GGC TTA CTC GCT TGC 

9656 TGT GCC AAA TGC TTG TAC TAG TTG CGC GGC GCT ATA GCG CCG CGC TAG TGG 
9707 GCCCCCGCGC GAAACCCGCA CTAGCCCACT AGATTTCCGC ACCTGTTGCT GTATAG 

WE CLAIM: 



1. A nucleic acid corresponding to a nucleic acid 
encoding a Cendehill rubella protein selected from the 
group consisting of: p150; p90; C; E1; E2. 

2. A nucleic acid corresponding to a non-translated 
region of the Cendehill genome. 

3. The nucleic acid of claim 2 wherein the non-translated 
region is a 5' non-translated region in which at least one 
of a terminal loop or a medial loop is different in size as 
compared to wild-type rubella 5' non-translated region. 

4. A nucleic acid which includes a sequence or sequences 
of nucleotides corresponding to a 5' non-translated region, 
p90 and p150 of Cendehill. 

5. DNA including a sequence of nucleotides corresponding 
to the entire Cendehill genome as shown in Appendix 1. 

6. The nucleic acid of any one of claims 1-4 which is 
DNA. 



7. The nucleic acid of any one of claims 1-4 which is 
RNA. 

8. The nucleic acid of any one of claims 1-7 further 
including one or more sequences of nucleotides 
corresponding to all or part of a genome of a rubella 
strain other than Cendehill. 



9. A plasmid or viral vector that includes a nucleic acid 
according to any one of claims 1-5 or 8, wherein the 
nucleic acid is DNA. 

10. DNA comprising a sequence of nucleotides complementary 
to rubella genomic RNA capable of encoding an infectious 
virus of the Cendehill strain or having an attenuating 
phenotype comparable to Cendehill. 

11. DNA including a first sequence of nucleotides 
corresponding to one or more of: a non-translated region, 
p150, p90, C, E1 and E2 of Cendehill strain; and, a second 
sequence of nucleotides that is derived from a rubella 
virus strain other than Cendehill, wherein said DNA encodes 
an infectious rubella virus. 

12. DNA comprising sequences of nucleotides corresponding 
to nucleotides 1 to 5355 of Cendehill and nucleotides 5356 
to 9762 of RA27/3. 
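
The coordinate arithmetic in claim 12 can be illustrated with a short Python sketch. It is only an illustration, not the cloning procedure: the function name `splice_chimera` and the toy sequences are hypothetical, and the sketch assumes both cDNA sequences share one 1-based coordinate system, so that nucleotide 5356 of the second strain directly follows nucleotide 5355 of the first.

```python
def splice_chimera(five_prime: str, three_prime: str, junction: int = 5355) -> str:
    """Join nucleotides 1..junction of one sequence to junction+1..end of another.

    Positions are 1-based, as in the claims. Both inputs are assumed
    (hypothetically) to be numbered in the same coordinate system.
    """
    if junction < 1 or junction > len(five_prime) or junction >= len(three_prime):
        raise ValueError("junction lies outside one of the sequences")
    # Python slices are 0-based and end-exclusive, so [:junction] covers
    # positions 1..junction and [junction:] covers junction+1..end.
    return five_prime[:junction] + three_prime[junction:]


# Toy sequences standing in for the two cDNAs (not real genome data).
cendehill_like = "A" * 9762
ra27_3_like = "G" * 9762
chimera = splice_chimera(cendehill_like, ra27_3_like)
```

In this toy case the result is 9762 letters long, with the last "A" at position 5355 and the first "G" at position 5356.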

13. The DNA of claim 10, 11 or 12, in a plasmid or viral 
vector capable of replication and transcription of the DNA. 

14. DNA comprising one or more sequences of nucleotides 
encoding all or part of one or more of: p150, p90, C, E2 
and E1 of Cendehill virus, incorporated into an expression 
vector. 

15. A method of producing rubella virus comprising the 
steps of transcribing the DNA of claim 14 into RNA; 
transfecting cells with said RNA; and, recovering rubella 
virus from the transfected cells. 

16. Rubella virus obtained by the method of claim 15 
wherein the DNA transcribed includes a sequence of 
nucleotides derived from a rubella virus strain other than 
Cendehill. 

17. A method of producing DNA encoding a recombinant or 
chimeric rubella virus exhibiting the lack of 
arthrotropicity of Cendehill virus, comprising a step 
whereby: 

(a) nucleotides in Cendehill cDNA encoding viral 
structural protein are altered such that the protein so 
encoded increases immunogenicity of a recombinant rubella 
virus comprising said protein; 

(b) nucleotides in the non-translated regions or 
non-structural protein region of cDNA for rubella virus 
other than Cendehill are altered to decrease 
arthritogenicity of a recombinant rubella virus coded for 
by the altered cDNA; or, 

(c) cDNA for one or more of a Cendehill 
non-translated region, non-structural protein p150, and 
non-structural protein p90 is joined to cDNA for a rubella 
virus other than Cendehill to produce DNA corresponding to 
a complete RNA genome of a chimeric rubella virus. 

18. An infectious clone for a rubella virus comprising a 
vector which includes cDNA corresponding to one or more 
portions of Cendehill genome selected from the group 
consisting of: a non-translated region, protein p150, 
protein p90, protein C, protein E1 and protein E2; and 
wherein at least a part of cDNA in the infectious clone is 
cDNA for a rubella virus other than Cendehill. 

19. A method of producing rubella RNA comprising the step 
of transcribing the infectious clone of claim 18. 

20. Rubella RNA produced according to the method of 
claim 19. 

21. A method of producing a rubella virus comprising the 
steps of transfecting cells with RNA produced according to 
claim 19, and recovering rubella virus from the transfected 
cells. 

22. A rubella virus comprising a genome including a first 
portion which is equivalent to one or more ribonucleic 
acids selected from the group consisting of: Cendehill 
non-translated RNA; Cendehill p150 RNA; p90 RNA; C RNA; 
E1 RNA; E2 RNA; and wherein a second portion of the genome 
is equivalent to RNA of a rubella virus other than 
Cendehill. 

23. The virus of claim 22 wherein the virus other than 
Cendehill is RA27/3. 

24. The virus of claim 19 or 20 wherein the first portion 
is all of the Cendehill 5' non-translated RNA, p150 RNA, 
and p90 RNA. 

25. A Cendehill viral protein free of virus, selected from 
the group consisting of: p150, p90, C, E1 and E2, produced 
by expressing Cendehill cDNA encoding said protein from an 
expression vector. 

26. Rubella cDNA, RNA, or a rubella virus having one or 
more nucleotide substitutions selected from the group 
consisting of: 37-C; 55-G; 118-T(or)U; 358-C; 2829-A; 
3060-G; 3164-C; 3528-T(or)U; 4530-T(or)U; 6611-C; 6770-G; 
6771-G; 7428-T(or)U; 8786-G; 8788-T(or)U; 8864-A; 
9180-T(or)U; 9254-A; and 9741-T(or)U, wherein the aforesaid 
numbering of the nucleotide substitution is with reference 
to Appendix 1, and wherein said substitutions occur in the 
same context as shown in Appendix 1. 
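
The position/base list in claim 26 lends itself to a simple lookup, sketched below in Python. The names `CLAIM_26_SUBSTITUTIONS` and `substitutions_present` are hypothetical, "T(or)U" is stored as T with U treated as equivalent, and the sketch checks only the base at each 1-based position. It does not test the further requirement that each substitution occur in the same sequence context as in Appendix 1.

```python
# Positions (1-based) and bases transcribed from the list in claim 26.
# "T(or)U" entries are stored as "T"; RNA input is normalised before lookup.
CLAIM_26_SUBSTITUTIONS = {
    37: "C", 55: "G", 118: "T", 358: "C", 2829: "A", 3060: "G",
    3164: "C", 3528: "T", 4530: "T", 6611: "C", 6770: "G", 6771: "G",
    7428: "T", 8786: "G", 8788: "T", 8864: "A", 9180: "T", 9254: "A",
    9741: "T",
}


def substitutions_present(sequence: str) -> list[int]:
    """Return the listed positions at which the sequence carries the listed base."""
    normalized = sequence.upper().replace("U", "T")
    return [
        position
        for position, base in sorted(CLAIM_26_SUBSTITUTIONS.items())
        if position <= len(normalized) and normalized[position - 1] == base
    ]


# Toy RNA-style input (not real genome data): "N" everywhere except three
# listed positions, which are set to the listed base.
toy = ["N"] * 9762
for position in (37, 55, 9741):
    toy[position - 1] = CLAIM_26_SUBSTITUTIONS[position]
found = substitutions_present("".join(toy).replace("T", "U"))
```

For the toy input, `found` contains exactly the three positions that were set.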

27. A rubella cDNA, RNA or viral genome that encodes a 
rubella protein selected from the group of proteins 
consisting of: p150/929/tyr; p150/1006/gly; p150/1041/his; 
p150/1162/val; p90/1496/ile; C4/pro; C/87/gly; E2/306/val; 
E2/413/ile; E1/759/asp; E1/785/met; E1/890/leu; and, 
E1/915/thr, wherein the aforesaid proteins are identified 
by reference to a strain-specific amino acid in Cendehill 
polyprotein and wherein the strain-specific amino acid 
occurs in the same context as in the Cendehill polyprotein. 

28. Use of DNA incorporated into an expression vector 
according to claim 14 as a sub-unit vaccine. 

29. Use of DNA of claim 18 as a DNA vaccine. 
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Genomic RNA 



CAP, non-structural genes (P150, P90), structural genes 

Sub-genomic RNA: CAP, C, E2, E1 

Figure 1 



RV 3' end complement 
F1   5'-CGCGAATTC (T)n CTATACAGCAACAGGT 
        EcoRI 

RV 5' start 
F2   5'-TCGAAGCTT ATTTAGGTGACACTATA CAATGGAAGCTATCGGACCTCGCTTAGG- 
        HindIII   SP6 

9    5'-TGCAGCGTTCGACGCAAACG-       2133-2153 
10   5'-TCCGAGTGCGGTTGCGATC-        2243-2262 
16   5'-GCGTTCTTGATGTCGATATCGCG-    4410-4431 
18   5'-CTCACTGATGTCTACACGCAGATG-   5281-5763 
46   5'-CAACCACCTCGGGAATGC-         3241-3260 
125  5'-TAGTCTTCGGCGCTTGG-          5747-5763 
251  5'-TTTGCCAACGCCACGGC-          2603-2616 

Figure 2 
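
The label "RV 3' end complement" on primer F1 can be checked against the last row of the sequence listing, which ends in ACCTGTTGCT GTATAG. A minimal Python sketch of that check follows; the helper name `reverse_complement` is illustrative, and only the four unambiguous DNA letters are handled.

```python
# Map each DNA base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return sequence.upper().translate(COMPLEMENT)[::-1]


# Last 16 nucleotides of the listing and the 3' portion of primer F1,
# both copied from the document with spaces removed.
genome_3prime_end = "ACCTGTTGCTGTATAG"
f1_3prime_portion = "CTATACAGCAACAGGT"

primer_matches_genome_end = reverse_complement(f1_3prime_portion) == genome_3prime_end
```

The comparison holds for these two strings, which is consistent with F1 annealing to the 3' end of the genomic sequence.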
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5' 



F2   PvuI   NheI   BglII   3'   F1 

Figure 3 



BamHI 

NheI-AscI-MunI-ClaI-BglII-MluI-AgeI 

SmaI 

pCL1921 

pCL1921 

Figure 4 
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ENERGY = -22.6 [initially -22.6] R3 
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NTR P150 P90 

Cendehill mutations, designated by single letter code: C>Y, N>G, A>V, T>I, Y>H 

Figure 8 



C: S>P, T>G 
E2: A>V 
E1: N>D, L>M, H>L, A>T 

Key: Cendehill; Cendehill + M33; Y = glycosylation site 

Figure 9 
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